(54) Bending point extraction method for optical character recognition.

(57) A method is provided for extracting bending points from character images for use in an optical
character recognition procedure (28) that recognises the characters. In a preferred aspect, a contour
(boundary of strokes) of a character image is traced and strong curvatures are detected as bending
points using heuristically determined (72) parameters, with attributes such as position, angle of
curvature, convex or concave, and acuteness being provided as a data set output.

Field of the Invention

The present invention relates to the generation of feature data for use in optical character recognition
(OCR). More particularly, the invention concerns the acquisition of geometrical character feature data and
topological information, specifically character bending points, for structural analysis and character
classification in an optical character recognition system.

Background Art 

In optical character recognition, selected character data representing character features of interest are
employed in a classification procedure that attempts to classify and thus recognise characters based on the
character features provided as input. Among the various character features proposed for optical character
recognition, character "bending points" have been given substantial recent attention. A bending point
represents a topological curvature feature having attributes such as position and acuteness of curvature.
High character recognition rates have been achieved when geometrical information including character
bending points is used for structural analysis and character classification in an optical character
recognition system. For example, it has been reported (H. Takahashi, "A Neural Net OCR Using Geometrical
And Zonal-Pattern Features" (October, 1991)) that bending point features can be used to produce superior
recognition rates in neural network optical character recognition systems employing back propagation
methods.

Historically, the extraction of bending point information from input character data has been problematic.
Characters may have multiple bending points and decisions must be made regarding the significance of each
bending point feature such that insignificant features are excluded and relevant features are preserved.
Complex algorithms have been proposed to identify appropriate extraction points. For example, I. Sekita
et al, "Feature Extraction of Handwritten Japanese Characters by Spline Functions of Relaxation Matching",
Pattern Recognition, Vol. 21, No. 1, pp. 9-17 (1988), discloses a time consuming spline approximation
method. This method is said to require five times the CPU time of prior methods but is assertedly
justified by improved character recognition rates.

No proposals have been made to date for a bending point extraction method which provides good
recognition rates without undue processing time. Accordingly, given the high recognition rates obtainable
with properly selected bending point data, there remains unsatisfied an evident need for a fast yet
accurate bending point extraction method that overcomes the recognised deficiencies of existing procedures.

Disclosure of the Invention 

Accordingly the invention provides a method for identifying bending points in a character image for use
as input to an optical character recognition procedure, the method comprising the steps of inputting a
picture element (pixel) array pattern of black and white picture elements representing a character image
to be recognised, the pixel array pattern including a plurality of array positions representing continuous
contours of the character image; scanning the pixel array pattern to trace one or more continuous contours
of the character image and generating a list of contour points for each traced contour; determining for
each contour point an acuteness value representing an angle of contour curvature and generating a list of
acuteness values for each traced contour; dividing each acuteness list into contour groups, each contour
group having a series of consecutive points that are either all convex or all concave in curvature;
extracting selected bending points from one or more contour groups using heuristically determined
parameters in one or more iterations; and generating a bending point data set output including a list of
character bending points, their orientation and their acuteness.

In a preferred embodiment the step of tracing one or more contours of the character image includes
assigning a tracing direction value to each contour point representing a direction to a next contour
point, the step of assigning a tracing direction value including generating a pattern matrix having plural
storage positions corresponding to the contour points; the pattern matrix being generated by scanning the
pixel array pattern with a mask array, determining the number and position of black and white pixels
appearing in the mask array, and assigning direction values to the pattern matrix storage positions based
on the information determined from the mask array, the mask array being a two by two element array capable
of indicating a total of fourteen pixel combinations which are used to assign a total of four tracing
directions.

Also in a preferred embodiment the step of determining an acuteness value for each contour point includes:
generating a list of orientation values representing orientation directions from each contour point to a
selected subsequent contour point, and wherein the acuteness value for each contour point is the angular
difference in orientation value between the contour point and the selected subsequent contour point,
wherein the step of generating a list of orientation values includes using an orientation matrix having
plural storage positions, each storage position containing a value representing an orientation direction
and being addressable using x-y offset values representing a difference in x-y position between each
contour point and a selected subsequent contour point; and wherein the step of determining an acuteness
value for each contour point further includes smoothing the orientation values to provide enhanced
orientation continuity between each contour point and a selected subsequent contour point.

Preferably the heuristically determined parameters include an acuteness parameter A_n, an edge parameter
E_n and a length parameter L_n. The A_n parameter is used for excluding insignificant contour points by
defining a minimum acuteness A_n for bending point selection; the E_n and L_n parameters are used to
exclude insignificant contour groups by defining a minimum number E_n of edge points that are to be
excluded from each contour group and a minimum length L_n for each contour group; and the A_n, E_n and L_n
parameters are heuristically selected to be either strong, medium or weak as required for each bending
point extraction iteration. The step of extracting selected bending points is performed in four iterations
including a first iteration wherein the A_n, E_n and L_n parameters are selected such that a bending point
is initially extracted from each contour group having even weak curvature, a second iteration wherein the
A_n, E_n and L_n parameters are selected such that a bending point is extracted from contour groups having
strong curvature, a third iteration wherein the A_n, E_n and L_n parameters are selected such that a
bending point is extracted from contour groups having long gentle curvatures, and a fourth iteration
wherein the A_n, E_n and L_n parameters are selected such that a bending point is extracted from contour
groups of medium curvature and length.

Brief Description of the Drawings

Fig. 1 is a diagrammatic illustration of a picture element (pixel) array pattern representing a character
image to be recognised;

Fig. 2 is another diagrammatic representation of a pixel array pattern representing another character
image to be recognised;

Fig. 3 is a bending point table listing significant bending points of the character image represented by
Fig. 1;

Fig. 4 is a bending point table listing significant bending points of the character image represented by
Fig. 2;

Fig. 5 is a block diagram illustrating a system for optically recognising character images from a document;

Figs. 6a-6c constitute a flow diagram of a bending point extraction method in accordance with the present
invention;

Fig. 7 illustrates a pixel array pattern and a mask array used for scanning the pixel array pattern;

Fig. 7a is an enlarged view of a portion of the pixel array pattern of Fig. 7 showing the use of white and
black colors in the pixel array pattern positions to represent a character image;

Fig. 7b is an enlarged view of the mask array of Fig. 7 showing the assignment of a binary value to each
mask array position;

Fig. 8 is a diagrammatic illustration of a pattern matrix generated from the pixel array pattern of Fig. 7
in accordance with the invention;

Fig. 9 is a diagrammatic illustration of a tag matrix generated from the pixel array pattern of Fig. 7 in
accordance with the invention;

Fig. 9a is an enlarged view of the tag matrix of Fig. 9 showing the assignment of numerical values to pixel
array pattern positions for use in contour tracing;

Fig. 10 is an enlarged view of the mask array of Fig. 7 showing the assignment of tracing directions to
different mask array configurations resulting from scanning the pixel array pattern of Fig. 7;

Fig. 11 illustrates a first character image contour trace showing the tracing direction information
provided by the pattern matrix values;

Fig. 12 illustrates a first character image contour trace using the tracing control information provided
by the tag matrix values;

Fig. 13 illustrates an x-y coordinate listing of character image contour points resulting from a first
character image contour trace such as that shown in Fig. 12;

Fig. 14 illustrates second and third character image contour traces using the tracing direction
information provided by the pattern matrix values;

Fig. 15 illustrates second and third character image contour traces using the tracing control information
provided by the tag matrix values;

Fig. 16 illustrates x-y coordinate lists generated during second and third character image contour traces
such as those shown in Fig. 15;

Fig. 17 illustrates a stored orientation matrix for use in determining relative orientations between
coordinates of the x-y coordinate lists of Figs. 13 and 16;

Fig. 18 is an enlarged x-y coordinate list showing the manner in which differential x and y values may be
obtained between contour points and input as address values to the orientation matrix of Fig. 17;

Fig. 19 diagrammatically illustrates the assignment of orientation values to points along a character
image contour;

Fig. 20 illustrates an orientation list of orientation values assigned to character image contour points
and generated from the x-y coordinate list of Fig. 18;

Fig. 21 illustrates an acuteness list generated from the orientation list of Fig. 20;

Fig. 22 is a diagrammatic illustration of a character image corresponding to the pixel array pattern of
Fig. 7 showing convex and concave contour groups derived from tracing contours of the character image;

Fig. 23 is a diagrammatic illustration of the significance of various heuristically determined parameters
used for bending point extraction in accordance with the invention;

Fig. 24 diagrammatically illustrates the extraction of bending points as a result of three bending point
extraction iterations;

Fig. 25 diagrammatically illustrates the extraction of bending points as a result of four bending point
extraction iterations;

Fig. 26 is a diagrammatic illustration of a sample input character image; and

Fig. 27 is a diagrammatic illustration of a bending point data output set generated in accordance with the
invention.


Detailed Description of the Invention 

For any character image susceptible of recognition via automated optical character recognition procedures,
there exist one or more bending points which may be considered unique to the class of which the character
image is a member. Figs. 1-4 show examples of character image bending points. In each figure, a character
image is represented as a picture element (pixel) array pattern of 24x18 pixels. Each pixel array pattern
includes white and black pixels with the black pixels being arranged to correspond to the character image.
In Figs. 1 and 2, black pixels are shown by an asterisk "*" and white pixels are shown as white spaces.

Fig. 1 illustrates the character image "A". This image has seven bending points identified by the letters
"a" through "g". Fig. 2 illustrates the character image "D" and shows six bending points labeled "a"
through "f". The minimal number of bending points for any character image is usually two for images such
as the number "1". In Figs. 3 and 4, the bending points of Figs. 1 and 2 are respectively tabularised to
indicate x-y coordinate position, whether the bending point is convex or concave, the direction of the
bending point and bending point acuteness. The x-y coordinates are the pixel array positions of the bending
points. The convex/concave designation indicates the nature of the contour curvature at the bending point.
The directions of bend are quantised to one of eight values (e.g., top = 0, left = 2, bottom = 4,
right = 6, top-right = 7, etc.). The acuteness is calculated using an angle of two lines from the bending
point to the Nth previous point and from the bending point to the Nth following point. The sharper the
angle, the higher the acuteness value (e.g., 9 represents very strong acuteness, 1 represents very weak
acuteness).

The bending point extraction method of the present invention serves to generate bending point data sets
from input character images for subsequent character recognition. In Fig. 5, a document 10 contains one or
more character images to be recognised. It will be understood that the document 10 may include a wide
variety of character image bearing media in many forms and configurations. For example, document 10 could
be a letter containing alphanumeric text information or a drawing containing graphics and text information.
The document 10 could also be a package or a label or tag with alphanumeric text information requiring
scanning, as might be used, for example, on a postal package. Each input document 10 is scanned and
thresholded and the characters are segmented using a conventional scanning, thresholding and segmenting
apparatus 12. Devices of this type are well known in the art and typically include a document feed
mechanism, a light source, a lens, plural optical sensing elements arranged in a line, a thresholding
circuit and a segmenting circuit. The number of optical sensing elements is typically about eight
elements/mm. That is, the pixel density in the main scan direction is typically around 200 pixels/inch and
the pixel density in the sub-scan direction perpendicular to the main scan direction is also around 200
pixels/inch. One optical element generates an analog signal corresponding to one pixel, and this analog
signal is applied to the threshold circuit. A binary "1" signal representing a black pixel is generated
when the analog signal is lower than a predetermined threshold value, and a binary "0" signal representing
a white pixel is generated when the analog signal is higher than the threshold. The segmenting circuit
separates each character image into separate character pixel array patterns 14 as shown in Fig. 5. The
pixel array patterns 14 can be stored in frames of 24x16 pixels, for example, in an input storage buffer
16, which is also conventional in nature.

It will be understood that the bending point extraction method of the present invention is invoked at a
point in the optical character recognition process where characters have been obtained from a document
whose objects have been scanned, thresholded and segmented and stored in the input storage buffer 16. The
bending point extraction method may be implemented using a conventional data processing system 18
including a central processing unit (CPU) 20, one or more storage registers 22, local read-only and random
access memory 24, and long-term program memory 25 for storing programs including executable instruction
sets for performing the bending point extraction method of the present invention. Conventional data
processing systems suitable for implementing the invention include stand-alone personal computers (PCs)
such as the IBM PS/2 from International Business Machines Corporation operating under IBM's OS/2 Operating
System. (IBM, PS/2 and OS/2 are trademarks of International Business Machines Corporation). Other data
processing systems include networked PCs and work stations, as well as mid-range and mainframe platforms.
In a preferred aspect, the bending point extraction method is software implemented, but could also include
partial hardware implementation, as discussed in more detail below. The data processing system 18 produces
a bending point data set output which is stored in a conventional output storage device 26 utilising a
tape, disk or other known permanent storage media. The bending point data set stored in the output storage
device 26 is presented as an input to an optical character recognition system 28. It will be understood
and appreciated that various optical character recognition systems could be utilised to receive the bending
point data set generated in accordance with the present invention including neural network OCR systems
utilising back propagation methods, as previously discussed.

Referring now to Figs. 6a-6c, a preferred bending point extraction procedure will now be described with
further reference being made to Fig. 7 using, by way of example, a character image corresponding to the
number "8". This number is used because it contains both exterior and interior contours, yet is not unduly
convoluted or complex. Character image scanning is shown as occurring in step 50 of Fig. 6a to produce one
or more pixel array patterns that are stored in the input storage buffer 16 described above. The pixel
array pattern 100 of Fig. 7 is a 24x16 array of black and white picture elements representing the character
image to be recognised. This pixel array pattern includes a plurality of array positions, including
positions 102-108 shown in Fig. 7a, representing continuous contours of the character image. In step 52 of
the bending point extraction procedure, the pixel array pattern 100 is input into the data processing
system 18 and stored in local memory 24.

The first goal of the bending point extraction procedure is to trace the exterior and interior continuous
contours of the character image 100 to generate a list of contour points for each contour traced. The
procedure for tracing the contours of the character image "8" employs a 2x2 mask array 110 for scanning
each line of the pixel array pattern 100 in a left to right horizontal sweep. Fig. 7b shows that the mask
array 110 includes four array positions which are assigned binary place values of 8, 4, 2 and 1. Depending
on the combination of black and white pixels appearing in these positions, values from zero to fifteen can
be read from the mask array.

In process step 54 of the bending point extraction procedure, the mask array 110 is used to scan every
position of the pixel array pattern 100. This scanning results in the generation of a pattern matrix and a
tag matrix in process steps 56 and 58 of the bending point extraction procedure. The pattern matrix and tag
matrix are shown in Figs. 8 and 9, respectively. Each matrix includes 25x17 storage positions which are
generated by virtue of the fact that the center of the mask array 110, as shown in Fig. 7, scans the pixel
array pattern 100 from the left edge to the right edge, starting from the top edge of the pixel array
pattern and then between each row of the pixel array pattern until the bottom edge is reached. In this way,
each interior position of the pattern and tag matrices will correspond to a position representing the
intersection of four positions of the pixel array pattern 100.

Each element of the pattern matrix 120 has a value (0-15) which is a weighted sum determined from the
values (black or white) of the four array positions of the 2x2 mask array 110. These values are used for
deciding a tracing direction, as discussed below. Each element of the tag matrix is also assigned a value
determined from the values (black or white) appearing in the positions of the mask array 110. If the mask
array positions are all white or all black, a value of zero (no contour) is assigned to the corresponding
tag matrix position. If the mask array shows two black and two white pixels (diagonally crossed), a value
of two is assigned to the tag matrix. All other mask array value combinations result in a value of one
being assigned to the tag matrix. The tag matrix 130 is used for tracing control such that contours are
only traced once. A tag matrix value of two is a special case used for matrix positions that are part of
two contours, as shown in Fig. 9a.
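
By way of illustration only, the generation of the pattern and tag matrices might be sketched in Python as
follows. The function name build_matrices is hypothetical, positions outside the pixel array are assumed to
be white, and the placement of the place values (top-left 8, top-right 4, bottom-left 2, bottom-right 1) is
an assumption, since it is given only in Fig. 7b; it is chosen here because it agrees with the tracing
directions listed in the text.

```python
def build_matrices(pixels):
    """Build pattern and tag matrices from a binary image (1 = black, 0 = white).

    For an image of R rows by C columns, each matrix has (R+1) x (C+1) entries,
    one per corner point shared by up to four pixels.
    """
    rows, cols = len(pixels), len(pixels[0])

    def px(r, c):
        # ASSUMPTION: positions outside the pixel array count as white.
        return pixels[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    pattern = [[0] * (cols + 1) for _ in range(rows + 1)]
    tag = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows + 1):
        for x in range(cols + 1):
            # ASSUMED weights: top-left 8, top-right 4, bottom-left 2, bottom-right 1.
            value = (8 * px(y - 1, x - 1) + 4 * px(y - 1, x)
                     + 2 * px(y, x - 1) + px(y, x))
            pattern[y][x] = value
            if value in (0, 15):      # all white or all black: no contour
                tag[y][x] = 0
            elif value in (6, 9):     # diagonally crossed: shared by two contours
                tag[y][x] = 2
            else:
                tag[y][x] = 1
    return pattern, tag
```

With these assumptions a 24x16 pixel array pattern yields matrices of 25x17 storage positions, which agrees
with the dimensions stated above.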

The procedure for scanning the pixel array pattern 100 to generate the pattern and tag matrices 120 and
130 can be advantageously implemented in software using conventional programming languages such as C or the
like. Additionally, process steps 54-58 could be partially implemented in hardware using the storage
registers 22 which preferably include a pair of 16-bit shift registers. In this implementation, successive
line pairs of the pixel array pattern 100 are entered into the shift registers. By successive shifts of
each register, values for each position of the 2x2 mask array 110 are read sequentially and used to
generate the pattern and tag matrices. It will also be appreciated that the tag matrix 130 can be generated
using the values from corresponding positions of the pattern matrix 120.

In process step 60 of the bending point extraction procedure, a first contour of the input character image
is traced and an x-y list is created. In the tracing procedure, the tag matrix 130 is scanned in a left to
right direction to find a first non-zero value indicating a contour starting point. The x-y tag matrix
coordinate positions of the starting point are stored in an x-y coordinate list 140 as shown in Fig. 13. To
determine a direction to the next contour point, the pattern matrix 120 is consulted at the corresponding
x-y coordinate position. At this position, there will be stored a value between zero and fifteen
corresponding to the pattern of white and black pixels contained in the mask array 110 when positioned at
that point. Based on the number assigned to the pattern matrix storage position, a tracing direction is
determined as shown in Fig. 10. Pattern matrix values of 1, 5 and 13 indicate a downward tracing direction.
Pattern matrix values of 2, 3 and 7 indicate a leftward tracing direction. Pattern matrix values of 8, 10
and 11 indicate an upward tracing direction. Pattern matrix values of 4, 12 and 14 indicate a rightward
tracing direction. When the pattern matrix value is 6 or 9, the direction value for the previous contour
point is consulted to determine which direction to take, as shown in Fig. 10. It will be seen from Fig. 10
that the direction values assigned to the pattern matrix are intuitively correct based on the appearance of
the corresponding mask array patterns. If the mask array is thought of as a window overlying the pixel
array pattern, each mask array pattern will appear to correspond to a location on the contour of the input
character image. Fig. 8, for example, illustrates locations on the character contour patterns where mask
arrays having values of 10 and 5 would be generated. It is easy to see that the directions assigned to the
mask array values will cause the character contour to be followed during the tracing process.

For the pixel array pattern of Fig. 7, the trace starting point determined from the tag matrix 130
corresponds to a pattern matrix value of 1. As shown in Fig. 11, the direction to the next contour point is
downward. The remaining arrows in Fig. 11 show successive tracing directions based on the pattern matrix
value at each subsequent contour point. Fig. 12 illustrates the starting contour point of the tag matrix
130. Once the x-y coordinates of the starting point are placed in the x-y list, the tag matrix value at
that location is decremented by 1 to indicate that the contour point has been treated. The x-y coordinate
of the next contour point is then determined from the tag matrix 130 by moving one position in the tracing
direction determined from the pattern matrix 120. The x-y coordinates of the new contour point are then
stored in the x-y list and the process continues in similar fashion until the entire contour is traced.
Fig. 13 illustrates an x-y list 140 generated by the tracing procedure. This trace will produce a list of
x-y coordinate points defining the exterior contour of the input character image "8". Subsequent second and
third traces are performed in similar fashion as shown in Figs. 14, 15 and 16 and x-y lists 142 and 144 are
generated for the interior contours of the number "8". Following generation of the x-y lists 140, 142 and
144, the pixel array pattern 100, the pattern matrix 120 and tag matrix 130 are no longer required and may
be discarded, all contour points having been identified and stored in the respective x-y lists. Step 62 of
the bending point extraction procedure illustrates testing to determine whether additional contours remain.
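
The tracing procedure just described might be sketched in Python as follows, by way of example only. The
twelve unambiguous directions are taken from the text; the rule used for pattern values 6 and 9, the
placement of the mask place values, and the restriction of starting points to unambiguous positions are
assumptions of this sketch, and the names trace_contours and pattern_matrix are hypothetical. Coordinates
are (x, y) with y increasing downward.

```python
DOWN, LEFT, UP, RIGHT = (0, 1), (-1, 0), (0, -1), (1, 0)

# Tracing directions for the unambiguous pattern values, as listed in the text.
DIRECTION = {1: DOWN, 5: DOWN, 13: DOWN, 2: LEFT, 3: LEFT, 7: LEFT,
             8: UP, 10: UP, 11: UP, 4: RIGHT, 12: RIGHT, 14: RIGHT}

# ASSUMPTION: at a diagonal pattern (6 or 9) the trace stays with the pixel it
# was following, chosen from the previous direction. Fig. 10 may differ.
DIAGONAL = {(6, UP): LEFT, (6, DOWN): RIGHT, (9, RIGHT): UP, (9, LEFT): DOWN}


def pattern_matrix(pixels):
    """Weighted 2x2 sums (ASSUMED weights: TL 8, TR 4, BL 2, BR 1)."""
    rows, cols = len(pixels), len(pixels[0])

    def px(r, c):
        return pixels[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[8 * px(y - 1, x - 1) + 4 * px(y - 1, x) + 2 * px(y, x - 1) + px(y, x)
             for x in range(cols + 1)] for y in range(rows + 1)]


def trace_contours(pixels):
    """Return one x-y list per contour of a binary image (1 = black)."""
    pattern = pattern_matrix(pixels)
    tag = [[0 if v in (0, 15) else 2 if v in (6, 9) else 1 for v in row]
           for row in pattern]
    contours = []
    while True:
        # First untreated position in raster order. SIMPLIFICATION: diagonal
        # positions are skipped as starting points because they need a
        # previous direction.
        start = next(((x, y) for y, row in enumerate(tag)
                      for x, t in enumerate(row)
                      if t and pattern[y][x] in DIRECTION), None)
        if start is None:
            return contours
        points, (x, y), prev = [], start, None
        while True:
            points.append((x, y))
            tag[y][x] -= 1                      # mark the point as treated
            value = pattern[y][x]
            prev = DIRECTION.get(value) or DIAGONAL[(value, prev)]
            x, y = x + prev[0], y + prev[1]     # move one position
            if (x, y) == start:
                break
        contours.append(points)
```

Applied to a 3x3 block of black pixels with a white centre, this sketch returns two lists, one for the
exterior contour and one for the interior contour, in the same manner as the lists 140, 142 and 144 are
produced for the number "8". A contour made up solely of diagonal positions would be missed by the
simplified starting rule.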

Following the identification of all character contours, the bending point extraction procedure implements
process step 64. There, an orientation list representing the orientation directions between selected points
in the x-y list is generated. As shown in Fig. 17, the orientation list can be rapidly generated using an
orientation matrix look-up table 150. The orientation matrix 150 has plural storage positions, each storage
position containing a value representing an orientation direction between a contour point and a selected
subsequent contour point. The orientation matrix is addressable using x-y offset values representing the
difference in x-y coordinate value between the contour point and the selected subsequent contour point. By
way of example, Fig. 18 illustrates an x-y list 152 containing a series of x-y contour point coordinate
values. It is desirable to find an orientation direction from each contour point to an Nth following
contour point. The number used for the threshold increment N may vary but satisfactory results have been
achieved using N=3 for 24x16 pixel array patterns. For pixel array patterns with more positions, higher N
values could be used. Figs. 17 and 18, for example, are based on a threshold increment value of N=5.
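
A look-up table of this kind might be precalculated as in the following Python sketch, which is offered as
an illustration under stated assumptions rather than as the table of Fig. 17. The table is sized
(2N+1)x(2N+1) so that every offset reachable in N steps is addressable; the angle convention (degrees in
the range 0 to 360, measured with atan2 from the positive x axis, y increasing downward) and the wrap-around
at the end of a closed contour are assumptions, and the function names are hypothetical.

```python
import math


def build_orientation_matrix(n):
    """Precalculate a (2n+1) x (2n+1) table of orientation angles in degrees.

    Entry [dy + n][dx + n] holds the direction of the offset (dx, dy).
    ASSUMPTION: angles are atan2(dy, dx) mapped into [0, 360).
    """
    return [[math.degrees(math.atan2(dy, dx)) % 360.0
             for dx in range(-n, n + 1)]
            for dy in range(-n, n + 1)]


def orientation_list(points, n, table=None):
    """Orientation from each contour point to its n-th following point."""
    table = table or build_orientation_matrix(n)
    count = len(points)
    result = []
    for i, (x, y) in enumerate(points):
        # ASSUMPTION: the contour is closed, so indices wrap around.
        nx, ny = points[(i + n) % count]
        result.append(table[ny - y + n][nx - x + n])   # pure table look-up
    return result
```

For the offset dx = -2, dy = 3 with N=5, the entry addressed is table[8][3], which under this convention
holds an angle of roughly 123.7 degrees.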

The orientation for each contour point is thus defined as an arrow from the contour point to the Nth
following point. It is quickly obtained using the precalculated orientation matrix 150, which preferably
consists of (2N+1)x(2N+1) elements. In Fig. 18, the x-y offset values between the first contour point
(x=8, y=1) in the x-y list 152 and the fifth following contour point (x=6, y=4) in the x-y list are
dx = -2, dy = 3. Using these values as addresses in the orientation matrix 150, it is seen that an
orientation value θ representing an angle from a reference direction (e.g. zero degrees) to the orientation
direction is quickly determined. Fig. 19 graphically illustrates how the orientation values relate to the
exterior contour of the number "8". The orientation values determined in step 64 of the bending point
extraction procedure are stored in an orientation list 160 as shown in Fig. 20. In some cases, it may be
desirable to perform a smoothing process as in step 66 of the bending point extraction procedure to provide
enhanced orientation continuity between the contour points. The smoothing process determines a more general
angle between contour points by taking a weighted average of orientation values for the contour points
immediately preceding and succeeding the contour point of interest. For example, a smoothed orientation
value θ_i for the ith contour point can be determined from the orientation values θ_{i-1} and θ_{i+1} for
the immediately preceding and succeeding contour points in accordance with the following formula:

(θ_{i-1} + 2θ_i + θ_{i+1}) / 4

It is to be noted that smoothing is not generally required where the pixel array pattern is sized at 50x50
pixels or larger and the threshold increment N is around 7 or more contour points.
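
The weighted average above might be applied as in the following Python sketch. The function name is
hypothetical. Two points are assumptions of the sketch and are not stated in the text: orientation values
are taken to be in degrees, and the neighbouring values are expressed relative to the centre value so that
the average behaves sensibly where the angle passes through 0/360 degrees. Away from that wrap-around the
result equals (θ_{i-1} + 2θ_i + θ_{i+1}) / 4 exactly.

```python
def smooth_orientations(theta):
    """Smooth orientation values (degrees) around a closed contour."""

    def wrap(delta):
        # Map an angular difference into the range [-180, 180).
        return (delta + 180.0) % 360.0 - 180.0

    count = len(theta)
    smoothed = []
    for i, centre in enumerate(theta):
        before = wrap(theta[i - 1] - centre)              # previous point, relative
        after = wrap(theta[(i + 1) % count] - centre)     # next point, relative
        # (prev + 2*centre + next) / 4, rewritten relative to the centre value.
        smoothed.append((centre + (before + after) / 4.0) % 360.0)
    return smoothed
```

For example, the values 10, 20, 40, 20 smooth to 15, 22.5, 30, 22.5, and the values 350, 0, 10 leave the
middle value at 0 rather than pulling it toward 90 as a naive average of the raw numbers would.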

Following generation of the orientation list and smoothing thereof, the bending point extraction procedure
generates an acuteness list in process step 68. The acuteness list is shown in Fig. 21 as reference number
170. The acuteness list is generated by determining values representing the acuteness angle of curvature
between each contour point and the Nth following contour point. The acuteness angle is readily found by
finding the difference between the orientation values θ_i and θ_{i+N} assigned to the ith and (i+N)th
contour points. The value of N is preferably the same as that used in assigning orientation values.
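
As a minimal sketch, the acuteness list might be computed in Python as follows. The function name is
hypothetical, orientations are assumed to be in degrees, the difference is taken as θ_i minus θ_{i+N} and
wrapped into the range -180 to 180, and the contour is assumed to be closed so that indices wrap around.
The scaling of acuteness to the values 1 through 9 used in Figs. 3 and 4 is not attempted here.

```python
def acuteness_list(theta, n):
    """Signed orientation change (degrees) between point i and point i + n.

    ASSUMPTIONS: the difference is theta[i] - theta[i + n], wrapped into
    [-180, 180), and the contour is closed (indices wrap around).
    """
    count = len(theta)
    return [(theta[i] - theta[(i + n) % count] + 180.0) % 360.0 - 180.0
            for i in range(count)]
```

With orientations of 90, 0, 270 and 180 degrees around a square contour and N=1, every point receives a
value of 90, while a straight run of equal orientations receives 0. Which sign corresponds to convex
curvature depends on the angle convention and the tracing direction.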

In process step 70 of the bending point extraction procedure, the acuteness list 170 is divided into one or
more contour groups. Each contour group consists of a series of consecutive contour points that are either
all convex or all concave in curvature. Concavity and convexity are readily determined from the angle of
curvature. Angles of curvature that are less than or equal to 180 degrees, looking from the Nth previous
point to the Nth following point, are considered convex. All other angles of curvature are concave. Fig. 22
illustrates the result of identifying convex and concave contour point groups for the character "8"
identified as reference numeral 180. There are two convex contour groups 182 and 184. Contour groups 186,
188, 190 and 192 are all concave groups.
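
The division into contour groups might be sketched in Python as follows, for illustration only. The sketch
assumes a signed acuteness value per contour point in which non-negative values denote convex curvature
(a straight run thus counts as convex, matching the "less than or equal to 180 degrees" rule); that sign
convention and the function name are assumptions. Groups are not merged across the end of the list, which a
closed contour would require.

```python
from itertools import groupby


def contour_groups(acuteness):
    """Split a signed acuteness list into runs of convex or concave points.

    ASSUMPTION: values >= 0 are convex, values < 0 are concave.
    Returns a list of (kind, [point indices]) pairs in contour order.
    """
    groups = []
    index = 0
    for is_convex, run in groupby(acuteness, key=lambda a: a >= 0):
        length = len(list(run))
        kind = "convex" if is_convex else "concave"
        groups.append((kind, list(range(index, index + length))))
        index += length
    return groups
```

For the values 10, 20, 0, -5, -30, 15 this yields a convex group of points 0 to 2, a concave group of
points 3 and 4, and a convex group containing point 5.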

The next series of bending point extraction steps seeks to extract bending points based on heuristically
determined parameters. These parameters are determined by experiment prior to performing the bending point
extraction procedure. Essentially, a best initial approximation is made and the parameters are optimised to yield
the highest possible recognition rates. Three kinds of parameters are used. A first parameter A_n represents an
acuteness threshold which is used to reject bending points having an angle of acuteness below the acuteness
threshold. A second parameter is an edge value E_n which is used to eliminate bending points within E_n contour
points from the edge of each contour group. This parameter maintains appropriate spacing between selected
bending points. A third parameter L_n represents a minimum length threshold that is used to reject contour
groups having a length that is less than L_n contour points. Fig. 23 illustrates graphically the use of the edge
and length parameters E_n and L_n. Considering a contour group 200 extending between previously extracted
bending points 204 and 206, the E_n parameter eliminates bending points within E_n points of each previously
extracted bending point and the L_n parameter eliminates the contour group if there are fewer than L_n remaining
points therein.
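
The combined effect of the edge and length parameters shown in Fig. 23 can be sketched as below. This is an illustrative reading of the text, under the assumption that the length test is applied to the points remaining after the edge points have been removed. The function name `trim_group` and its arguments are hypothetical.

```python
def trim_group(indices, edge, min_length):
    """Apply the edge and length parameters to one contour group.

    indices:    contour point indices of the group, in contour order
    edge:       number of points to drop from each end (E_n)
    min_length: minimum number of remaining points required (L_n)

    Returns the remaining candidate points, or an empty list if the
    group is rejected.
    """
    # Drop `edge` points at both ends; an oversized edge leaves nothing.
    remaining = indices[edge:len(indices) - edge]
    if len(remaining) < min_length:
        return []
    return remaining


kept = trim_group(list(range(10)), edge=2, min_length=3)
rejected = trim_group(list(range(10)), edge=4, min_length=3)
```

A parameter that is "not applied" in a given iteration can be represented in this sketch by passing 0.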

The extraction of significant bending points from the contour groups is performed in a series of iterations.
For each iteration, there is a process step 72 wherein values for the A_n, E_n and L_n parameters are selected
(e.g., from a parameter look-up table) for use in the iteration. The subscript n designates the iteration step. With
each iteration, different parameters are used to extract different bending points. In process step 74 of the
bending point extraction procedure, a first bending point extraction step is the identification of contour groups
having potential bending points. This step utilises the edge parameter to eliminate E_n points from the ends of
each contour group and the length parameter to eliminate contour groups having a length of less than L_n points.
A next process step 76 finds a contour point having a maximum acuteness value in each selected contour group.
In process step 78, the identified maximum acuteness point is extracted as a bending point if it is determined
to be significant. The bending point is significant if its acuteness exceeds the acuteness threshold A_n. If the
acuteness threshold is not met, the maximum acuteness point is discarded. Following the extraction of
significant bending points from the selected contour groups, each contour group having a bending point extracted
therefrom is divided at the bending point into one or more contour subgroups. Thereafter, if further iteration is
required, as determined in process step 82, the procedure returns to process step 72 and new parameter values
are assigned for the next iteration.
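
The iteration described in process steps 72 to 82 can be sketched as a single loop. The sketch makes several assumptions that the text does not settle: acuteness is compared by magnitude (so that concave points with negative values are treated like convex ones), a group that fails the edge or length test is kept for later iterations rather than deleted, and the parameter values are plain numbers supplied by the caller. The function name `extract_bending_points` is hypothetical.

```python
def extract_bending_points(acuteness, groups, parameter_table):
    """Iteratively extract bending points from contour groups.

    acuteness:       signed acuteness value per contour point
    groups:          list of contour groups, each a list of point indices
    parameter_table: one (A_n, E_n, L_n) tuple per iteration
    """
    bending_points = []
    for a_n, e_n, l_n in parameter_table:            # step 72: select parameters
        next_groups = []
        for group in groups:
            # Step 74: remove edge points, then apply the length test.
            candidates = group[e_n:len(group) - e_n]
            if not candidates or len(candidates) < l_n:
                next_groups.append(group)
                continue
            # Step 76: point of maximum acuteness within the group.
            best = max(candidates, key=lambda i: abs(acuteness[i]))
            # Step 78: keep it only if it exceeds the threshold.
            if abs(acuteness[best]) > a_n:
                bending_points.append(best)
                # Divide the group at the bending point into subgroups.
                cut = group.index(best)
                for part in (group[:cut], group[cut + 1:]):
                    if part:
                        next_groups.append(part)
            else:
                next_groups.append(group)
        groups = next_groups                          # step 82: next iteration
    return bending_points


values = [0, 5, 40, 5, 0, 0, 3, 12, 3, 0]
found = extract_bending_points(values, [list(range(10))], [(10, 0, 0), (10, 1, 2)])
```

In the example, the first iteration extracts the sharp point at index 2 and divides the group there; the second iteration, with an edge value of 1, finds the weaker peak at index 7 in the longer of the two subgroups.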

As indicated, each bending point extraction iteration seeks to identify bending points of differing
significance. Look-up Table 1 below shows an example of a parameter table containing parameter values for four
heuristic steps:

Step    A_n    E_n    L_n
1       S      -      -
2       L      M      -
3       S      L      L
4       M      M      M

L: Large value    M: Medium value    S: Small value    -: Not applied

TABLE 1 - EXAMPLE OF PARAMETERS

Step 1 tries to extract one bending point from each contour group having even small curvatures. In Table
1, the acuteness parameter A_n is shown having a small value "S", indicating that the curvature threshold is weak.
It will also be noted that the length parameter is not assigned a value in the first iteration. This results in each
contour group being treated regardless of length. Moreover, because no previous bending points have been
extracted, there is no need to eliminate edge points to provide spacing between extracted bending points. Thus,
the edge parameter is also not assigned a value. The second iteration of Table 1 seeks to extract bending points
from contour groups having strong curvature. In this step, a large value "L" is assigned to the acuteness
threshold A_n to accept only contour groups having strong curvature. A medium value "M" is assigned to the edge
parameter E_n. This provides medium spacing between the bending point selected in the second iteration and
previously extracted bending points. No value is assigned to the length parameter L_n, such that length is not a
factor in the second iteration. The third iteration of Table 1 seeks to extract bending points from long gentle
curvatures as in the letter "O". The threshold parameter A_n is assigned a small value "S" to accept contour
groups having even weak curvatures. The edge parameter E_n is assigned a large value "L" to provide maximum
spacing between bending points, and the length parameter L_n is also assigned a large value "L" to accept only
long contour groups. A fourth step is used in Table 1 because the first three iterations sometimes leave
desirable bending points unextracted. In the fourth iteration, medium values "M" are assigned to each of the
parameters A_n, E_n and L_n.
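
A parameter look-up table of the kind shown in Table 1 can be represented as a small data structure. In the sketch below, the symbolic entries (S, M, L and "not applied") are taken from Table 1, but the numeric values in `LEVELS` are purely hypothetical placeholders: the description states only that real values are found by experiment. The names `TABLE_1`, `LEVELS` and `parameters_for` are likewise illustrative.

```python
# Symbolic parameter table: one (A_n, E_n, L_n) row per iteration.
# None means the parameter is not applied in that iteration.
TABLE_1 = [
    ("S", None, None),   # step 1: weak curvature threshold, no edge or length test
    ("L", "M", None),    # step 2: strong curvature only, medium spacing
    ("S", "L", "L"),     # step 3: long gentle curvatures
    ("M", "M", "M"),     # step 4: remaining medium cases
]

# Hypothetical numeric values for each level; real values would be tuned
# by experiment against recognition rates.
LEVELS = {
    "A": {"S": 10, "M": 30, "L": 60},   # acuteness threshold
    "E": {"S": 2, "M": 4, "L": 8},      # edge points to skip
    "L": {"S": 4, "M": 8, "L": 16},     # minimum group length
}


def parameters_for(step):
    """Return numeric (A_n, E_n, L_n) for a 1-based iteration step."""
    def resolve(name, level):
        # A parameter that is not applied behaves like a zero threshold.
        return 0 if level is None else LEVELS[name][level]

    a, e, l = TABLE_1[step - 1]
    return resolve("A", a), resolve("E", e), resolve("L", l)


all_steps = [parameters_for(step) for step in range(1, 5)]
```

Keeping the symbolic table separate from the numeric levels means the levels can be retuned without changing the structure of the heuristic steps.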

Figs. 24 and 25 graphically illustrate the effects of each iteration using the parameter values of Table 1.
In Fig. 24, the letter "Z" is treated. This letter, designated by reference number 210, includes initial convex
contour groups 212, 214, 216 and 218. It includes initial concave contour groups 220, 222, 224 and 226. The
bending points extracted in the first iteration are shown by the reference number 1. As described, a bending point
is extracted from each of the contour groups. These contour groups are then divided at each of the selected
bending points. In the second iteration, which tests for strong curvatures, a single bending point "2" is extracted
and the contour group containing that point is divided. In the third iteration, a bending point "3" is extracted
from a contour group having long gentle curvature. Fig. 25 shows the letter "D", identified by reference number
230. Initially, this character includes a single convex contour group 232 and a single concave contour group
234. In a first bending point extraction iteration, a single bending point "1" is extracted from each contour group
and the contour groups are divided. In a second iteration, bending points "2" having high acuteness are
extracted from each contour group and the contour groups are again divided at the extracted bending points. In
a third iteration, bending points "3" are extracted from two remaining contour groups having long gentle
curvatures. Finally, in a fourth iteration, remaining bending points "4" are extracted.

The effect of the Table 1 parameters can be evaluated from the optical character recognition rates achieved
when they are used. If desired, the parameters can be adjusted or more iteration steps added, and effective
values can be determined on a trial and error basis. Once a set of optimised parameter values is obtained, it
can be expected that desirable bending points will be extracted that match well with human intuition.

In process step 84 of the bending point extraction procedure, a bending point output data set is generated.
This data set contains the information shown below in Table 2:

The bending point information listed in Table 2 corresponds to a Japanese Katakana (phonetic) character which
can be graphically output for operator confirmation as shown in Figs. 26 and 27. In Table 2, n is the number of
bending points in the current contour, x and y are the coordinates of the bending points, d is the direction of
the bending point and c is the acuteness of the bending point. In Fig. 26, the strokes of a character are shown
as "0". In Fig. 27, extracted bending points are shown as asterisks "*". In addition, each convex point is
associated with a number from 1 to 9 representing the acuteness value. Each concave point is assigned a letter
from "a" to "i" also representing acuteness (i.e., a=-1 ... i=-9). The resultant data set includes an identification
of all character contours, the number of contour points in each contour, the number of convex/concave contour
groups in each contour, and an identification of all bending points in the character image including the x-y
bending point coordinates, the contour acuteness at each bending point (positive values are convex, negative
values are concave) and the contour orientation at each bending point. This data set can be passed to a selected
optical character recognition program for character recognition.
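
The display encoding used in Fig. 27 (digits for convex points, letters for concave points) can be sketched as a small mapping function. The sketch assumes the acuteness has already been quantised to a non-zero integer between -9 and 9; how that quantisation is performed is not described here. The function name `acuteness_symbol` is hypothetical.

```python
def acuteness_symbol(value):
    """Map a quantised acuteness value to its display symbol.

    Convex points (1..9) are shown as the digits "1".."9";
    concave points (-1..-9) are shown as the letters "a".."i".
    """
    if not 1 <= abs(value) <= 9:
        raise ValueError("acuteness must be a non-zero integer in -9..9")
    if value > 0:
        return str(value)
    # -1 maps to "a", -2 to "b", ... -9 to "i".
    return "abcdefghi"[-value - 1]


symbols = [acuteness_symbol(v) for v in (3, -1, -9, 9)]
```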

Thus, an automated bending point extraction method and apparatus for optical character recognition has
been shown and described. Advantageously, bending points may be quickly extracted in comparison to prior
art methods by using table look-up techniques. By analysing desired bending points, a heuristic method is
designed and essential parameters are developed. The parameters have different values for each heuristic step.
This enables the flexible extraction of bending points for various curvatures, small/large and gentle/sharp. By
repeating the bending point extraction step on a trial and error basis, users can select optimum values of the
parameters to get bending points which match well with human intuition. The bending point extraction method
of the present invention is believed to be competitive with more complex prior art algorithms. Using simple
values of parameters for each heuristic step, bending points can be flexibly extracted for various curvatures.
Because the method utilises table look-up techniques, processing time is reduced.

Claims



1. A method for identifying bending points in a character image for use as input to an optical character
recognition procedure (28), the method comprising the steps of:

inputting (52) a picture element (pixel) array pattern (14) of black and white picture elements
representing a character image to be recognised, the pixel array pattern (14) including a plurality of array
positions representing continuous contours of the character image;

scanning (54) the pixel array pattern (14) to trace (60, 62) one or more continuous contours of the
character image and generating (60, 62) a list of contour points for each traced contour;

determining for each contour point an acuteness value representing an angle of contour curvature
and generating (68) a list of acuteness values for each traced contour;

dividing (70) each acuteness list into contour groups, each contour group having a series of
consecutive points that are either all convex or all concave in curvature;

extracting (74, 76, 78, 80) selected bending points from one or more contour groups using
heuristically determined (72) parameters in one or more iterations (82); and

generating (84) a bending point data set output including a list of character bending points, their
orientation and their acuteness.

2. A method as claimed in claim 1 wherein the step (60) of tracing one or more contours of the character
image includes identifying a tracing starting point on each contour and determining a direction to a next
contour point based on the position of the tracing starting point in the contour.

3. A method as claimed in claim 1 wherein the step (60) of tracing one or more contours of the character
image includes identifying a tracing starting point on each contour and determining a direction to a next
contour point based on the value of adjacent positions in the pixel array pattern (14).

4. A method as claimed in any preceding claim wherein the step (60) of tracing one or more contours of the
character image includes assigning a tracing direction value to each contour point representing a direction
to a next contour point, said step of assigning a tracing direction value including generating a pattern matrix
having plural storage positions corresponding to the contour points, said pattern matrix being generated
by scanning the pixel array pattern (14) with a mask array, determining the number and position of black
and white pixels appearing in the mask array, and assigning direction values to the pattern matrix storage
positions based on the information determined from the mask array, said mask array being a two by two
element array capable of indicating a total of fourteen pixel combinations which are used to assign a total
of four tracing directions.

5. A method as claimed in any preceding claim wherein the step (60) of tracing one or more contours of the
character image includes maintaining tracing control information indicating contour points that remain
eligible for tracing.

6. A method as claimed in any preceding claim wherein the step (60) of tracing one or more contours of the
character image includes assigning a value to each contour point representing the number of times the
contour point remains eligible for tracing.

7. A method as claimed in claim 4 wherein the step (60) of tracing one or more contours of the character
image includes generating a tag matrix having plural storage positions corresponding to the contour
points, each tag matrix storage position containing a value indicating the number of times an associated
contour point remains eligible for tracing and wherein the tag matrix is generated using the pattern matrix
by assigning a value to each tag matrix storage position based on a value in a corresponding storage
position of the pattern matrix.

8. A method as claimed in any preceding claim wherein the step (84) of generating a list of contour points
includes generating a list of x-y coordinate values for each contour point.

9. A method as claimed in any preceding claim wherein the step of determining an acuteness value for each
contour point includes:

generating a list of orientation values representing orientation directions from each contour point to a
selected subsequent contour point, and wherein the acuteness value for each contour point is the angular
difference in orientation value between the contour point and the selected subsequent contour point,
wherein the step of generating a list of orientation values includes using an orientation matrix having plural
storage positions, each storage position containing a value representing an orientation direction and being
addressable using x-y offset values representing a difference in x-y position between each contour point
and a selected subsequent contour point; and

wherein the step of determining an acuteness value for each contour point further includes smoothing
the orientation values to provide enhanced orientation continuity between each contour point and a
selected subsequent contour point.

10. A method as claimed in any preceding claim wherein following the extraction of a bending point from each
contour group, the contour group is divided at the bending point into one or more contour subgroups and
the step of extracting selected bending points is repeated for each contour subgroup.

11. A method as claimed in any preceding claim wherein the heuristically determined (72) parameters include
an acuteness parameter A_n, an edge parameter E_n and a length parameter L_n.

12. A method as claimed in claim 11 wherein:

the A_n parameter is used for excluding insignificant contour points by defining a minimum acuteness A_n
for bending point selection;

the E_n and L_n parameters are used to exclude insignificant contour groups by defining a minimum number
E_n of edge points that are to be excluded from each contour group and a minimum length L_n for each
contour group; and

the A_n, E_n and L_n parameters are heuristically selected to be either strong, medium or weak as required
for each bending point extraction iteration.

13. A method as claimed in claim 12 wherein said step of extracting selected bending points is performed in
four iterations including a first iteration wherein the A_n, E_n and L_n parameters are selected such that a
bending point is initially extracted from each contour group having even weak curvature, a second iteration
wherein the A_n, E_n and L_n parameters are selected such that a bending point is extracted from contour
groups having strong curvature, a third iteration wherein the A_n, E_n and L_n parameters are selected such
that a bending point is extracted from contour groups having long gentle curvatures, and a fourth iteration
wherein the A_n, E_n and L_n parameters are selected such that a bending point is extracted from contour
groups of medium curvature and length.